Lifelog Scene Change Detection Using Cascades of Audio and Video Detectors
نویسندگان
چکیده
The advent of affordable wearable devices with a video camera has established the new form of social data, lifelogs, where lives of people are captured to video. Enormous amount of lifelog data and need for on-site processing demand new fast video processing methods. In this work, we experimentally investigate seven hours of lifelogs and point out novel findings: 1) audio cues are exceptionally strong for lifelog processing; 2) cascades of audio and video detectors improve accuracy and enable fast (super frame rate) processing speed. We first construct strong detectors using state-of-the-art audio and visual features: Mel-frequency cepstral coefficients (MFCC), colour (RGB) histograms, and local patch descriptors (SIFT). In the second stage, we construct a cascade of the trained detectors and optimise cascade parameters. Separating the detector and cascade optimisation stages simplify training and results to a fast and accurate processing pipeline.
منابع مشابه
Compressed Domain Scene Change Detection Based on Transform Units Distribution in High Efficiency Video Coding Standard
Scene change detection plays an important role in a number of video applications, including video indexing, searching, browsing, semantic features extraction, and, in general, pre-processing and post-processing operations. Several scene change detection methods have been proposed in different coding standards. Most of them use fixed thresholds for the similarity metrics to determine if there wa...
متن کاملUnsupervised Emotional Scene Detection from Lifelog Videos Using Cluster Ensembles
An emotional scene detection method is proposed in order to retrieve impressive scenes from lifelog videos. The proposed method is based on facial expression recognition considering that a wide variety of facial expression could be observed in impressive scenes. Conventional facial expression techniques, which focus on discriminating typical facial expressions, will be inadequate for lifelog vi...
متن کاملRUCMM at MediaEval 2015 Affective Impact of Movies Task: Fusion of Audio and Visual Cues
This paper summarizes our efforts for the first time participation in the Violent Scene Detection subtask of the MediaEval 2015 Affective Impact of Movies Task. We build violent scene detectors using both audio and visual cues. In particular, the audio cue is represented by bag-of-audio-words with fisher vector encoding. The visual cue is exploited by extracting CNN features from video frames. ...
متن کاملFire detection using video sequences in urban out-door environment
Nowadays automated early warning systems are essential in human life. One of these systems is fire detection which plays an important role in surveillance and security systems because the fire can spread quickly and cause great damage to an area. Traditional fire detection methods usually are based on smoke and temperature detectors (sensors). These methods cannot work properly in large space a...
متن کاملScene change detection by audio and video clues
Automatic video scene change detection is a challenging task. Using audio or visual information alone often cannot provide a satisfactory solution. However, how to combine audio and visual information efficiently still remains a difficult issue since there are various cases in their relationship due to the versatility of videos. In this paper, we present an effective scene change detection meth...
متن کامل